Empirical Research on the Prediction Effect of Volatility Model Based on the Perspective of Investor Sentiment Health and Market Liquidity

Healthy investor sentiment has an important impact on stock market volatility. To avoid extreme volatility in the stock market, it is crucial to monitor the sentiment health of investors. We therefore establish a realized volatility model, an implied volatility model, and a historical volatility model, introduce investor sentiment and market liquidity into each of them, and empirically study how well they forecast future daily and weekly volatility, with a view to supporting the development of the physical industry and the improvement of organizational performance. We find that the HAR-RV model has the strongest predictive ability, while the GJR-GARCH model's predictive effect is not ideal; market liquidity and high investor sentiment improve the predictive effect of most models; and short-term (one-day) volatility is predicted better than long-term (one-week) volatility.


Introduction
As one of the most important derivatives, options have gradually developed into mainstream products for global transactions. The 50 largest and most influential stocks on China's Shanghai Stock Exchange constitute the SSE 50 sample stocks, which have become one of the most active investment indexes in the Chinese stock market. On February 9, 2015, SSE 50ETF options were officially listed, marking the official arrival of the Chinese options era. Since then, the trading volume of options has expanded rapidly, and options have gradually become an extremely important risk management tool in China's financial system. Therefore, topics such as the prediction of option price volatility are under continuous discussion in academia and industry, with a view to supporting the development of the physical industry and the improvement of organizational performance. The prediction of volatility has always been central to derivatives trading research. Volatility reflects the uncertainty of price changes and plays an important role in risk supervision, product pricing, and investment portfolio selection. Early volatility forecasting research mainly focused on low-frequency historical data such as daily data and used GARCH family models to make predictions. In recent years, with the development of information technology, academia has begun to use high-frequency data to predict volatility. Corsi et al. empirically found that realized volatility has stronger pricing power for S&P500 options [1], and existing studies have found that implied volatility contains important prediction and decision-making information for market participants, which can greatly improve volatility forecasts. For example, Kambouroudis et al. incorporated implied volatility, realized volatility, and historical volatility into one model at the same time, demonstrating the ability of implied volatility to improve forecast accuracy [2].
At present, some studies have shown that investor sentiment health can have a significant impact on market volatility. Renault [3] took the S&P500 ETF as the object and found that the effect of predicting long-term volatility changed after introducing investor sentiment. Zhang et al. [4] also empirically found that the explanatory power of the HAR model is greatly enhanced after introducing investor sentiment. Yang and Wang [5] found that investor sentiment can significantly improve the predictive effect of the volatility model. Qu and Shen [6] showed that the predictive effect of the volatility model is particularly significant when investor sentiment is high. Therefore, we incorporate an investor sentiment indicator into the volatility prediction models to test the impact of high or low investor sentiment on the prediction effect of volatility. The impact of market liquidity on the forecasting of future volatility is also a focus of this article. Market liquidity is one of the important indicators of the efficiency of market operations. Song et al. [7] showed that market liquidity can significantly affect market volatility. Yao [8], based on the TVP-SV-SVAR model, empirically found that market liquidity is a one-way Granger cause of financial market stability and significantly affects the future trend of market volatility. Therefore, we include a market liquidity indicator in the volatility models to empirically test whether high or low market liquidity affects the accuracy of predicting volatility.
China has hundreds of option products based on the SSE 50 index, which provides a rich source of data for studying volatility forecasting. Therefore, we choose SSE 50 stocks, the SSE 50ETF index, and SSE 50ETF options as the research objects and establish the HAR-RV model, the ARMA-IV type model, and the GJR-GARCH model to study realized volatility, implied volatility, and historical volatility, respectively. In addition, we construct comprehensive indicators of market liquidity and investor sentiment and introduce them into those volatility models.
Compared with the existing research, the marginal contribution of our paper has two main points. First, although part of the previous literature empirically analyses the impact of investor sentiment on volatility forecasting, few studies systematically introduce investor sentiment into all three models for analysis and discussion. Second, we not only compare the prediction effects of realized volatility, historical volatility, and implied volatility against each other but also compare how well each of the three kinds of volatility predicts short-term (one-day) and long-term (one-week) volatility.

Market Liquidity Indicator.
Due to the scale differences between different stocks, we follow the method of Yin and Wu et al. [9] for constructing market liquidity indicators and use the weight-based transaction amount indicator (WA) to measure market liquidity. In the definition of WA, $\mathrm{Amount}_{i,t}$ and $Q_{i,t}$ are, respectively, the trading value and total market value of the ith stock at time t, and $N_t$ is the market stock trading volume at time t.

Investor Sentiment Indicator.
The measurement of investor sentiment indicators is usually classified into three categories: direct indicators, indirect indicators, and comprehensive indicators. Direct indicators are obtained directly through surveys and similar methods. Indirect indicators are obtained indirectly through objective analysis of market data. Comprehensive indicators are calculated through model-based methods such as principal component analysis; an example is the BW sentiment comprehensive index constructed by Baker and Wurgler [10].
Taking into account the actual situation of the SSE 50 market and the availability of data, we adopt the sign-based turnover rate (ATurnover), the number of price limits (NF), the realized skewness (RSkew), the sign-based jump (SJV), the advance-decline line (ADLine), and the momentum effect index (MTM), and we use principal component analysis to construct a comprehensive indicator of investor sentiment (SENT). The larger the SENT value, the more active the investor sentiment. The specific content of the proxy indicators is shown in Table 1.
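As an illustration of the first proxy, Table 1 defines ATurnover as the turnover rate multiplied by the sign of the market return, $\mathrm{Turnover}_t \times (R_t/|R_t|)$. The sketch below is a minimal reading of that formula; the function name `signed_turnover` is hypothetical, and returning 0 when the market return is exactly zero is an assumption, since the ratio $R_t/|R_t|$ is undefined there.

```python
def signed_turnover(turnover, market_return):
    """Turnover rate carrying the sign of the market return R_t.

    Assumption (not stated in the text): the value is 0.0 when
    market_return is exactly zero, where R_t / |R_t| is undefined.
    """
    if market_return > 0:
        return turnover
    if market_return < 0:
        return -turnover
    return 0.0


# Illustrative toy inputs only (not data from the study).
rising_day = signed_turnover(0.03, 0.012)    # market rose: positive turnover
falling_day = signed_turnover(0.03, -0.008)  # market fell: negative turnover
```

With this sign convention, heavy trading on a falling day pulls the proxy down rather than up.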
Then, we perform standardization processing, the KMO test, and the Bartlett sphericity test. The results show that the value of the KMO test is 0.61 and the significance value of Bartlett's sphericity test is 0.00, indicating that the above proxy indicators are suitable for principal component analysis.
According to Table 2, the eigenvalues of the first three principal components are all greater than 1, and the cumulative variance explanation rate reaches 70.663%, so they describe the emotional state of market investors with little loss of information. Therefore, we construct the investor sentiment index (SENT) as a weighted combination of the first three principal components: $\mathrm{SENT}_t = \mathrm{ATurnover}_t \times 0.269 \ldots$
Out-of-Sample Predictive Indicator.
We choose three loss functions, the mean absolute error (MAE), the mean square error (MSE), and the root mean square error (RMSE), to test the out-of-sample prediction effect of the models. The smaller the value, the better the prediction effect. Their expressions are as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|\hat{\sigma}_t^2 - \sigma_t^2\right|, \qquad \mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(\hat{\sigma}_t^2 - \sigma_t^2\right)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}},$$
where $\hat{\sigma}_t^2$ and $\sigma_t^2$ are, respectively, the forecast volatility and the actual volatility at time t, and n is the number of out-of-sample forecasts.
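The three loss functions can be sketched in Python as follows. This is a minimal illustration using their standard textbook definitions; the function names and the toy inputs are hypothetical and are not taken from the study.

```python
import math


def _check(forecast, actual):
    # Both series must be non-empty and aligned observation by observation.
    if len(forecast) != len(actual) or not actual:
        raise ValueError("forecast and actual must be non-empty and equal in length")


def mae(forecast, actual):
    """Mean absolute error between forecast and actual volatility."""
    _check(forecast, actual)
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def mse(forecast, actual):
    """Mean square error between forecast and actual volatility."""
    _check(forecast, actual)
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


def rmse(forecast, actual):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(forecast, actual))


# Illustrative toy values only (not data from the study).
toy_forecast = [1.0, 2.0, 3.0]
toy_actual = [2.0, 2.0, 5.0]
toy_losses = {
    "MAE": mae(toy_forecast, toy_actual),
    "MSE": mse(toy_forecast, toy_actual),
    "RMSE": rmse(toy_forecast, toy_actual),
}
```

Because MSE and RMSE square the errors, they penalize a few large misses more heavily than MAE does, which is one reason to report all three.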

Realized Volatility Model (HAR-RV Model).
Assume that the logarithm of the asset price follows a diffusion driven by Brownian motion:
$$dp_t = \mu_t\,dt + \sigma_t\,d\omega_t,$$
where $p_t$ is the logarithm of the instantaneous asset price at time t, $\mu_t$ is the instantaneous drift of the log price at time t, $\sigma_t$ is the instantaneous volatility of the log price at time t, and $\omega_t$ is a standard Wiener process. According to the limit theory, the volatility on day t is defined as the integral of the instantaneous variance over one day:
$$\int_{t-1}^{t}\sigma_s^2\,ds.$$
When the number of intraday samples tends to infinity, the realized volatility (RV) is a consistent estimator of the actual volatility, and its expression is
$$RV_t = \sum_{i=1}^{w} r_{i,t}^2,$$
where $r_{i,t}$ is the ith intraday logarithmic return on day t, and w is the total number of logarithmic returns in a day. In addition, compared with other volatility indicators, the realized volatility (RV) is calculated from high-frequency data, which can greatly reduce the interference of regression noise. Therefore, we use realized volatility (RV) as the measure of actual volatility in the models. In the same way, the weekly realized volatility is computed over a horizon of h = 5 trading days. Based on the above calculation method, this paper uses the heterogeneous autoregressive model (HAR-RV model) as the realized volatility model, where h = 1 and h = 5 correspond, respectively, to the daily-frequency (short-term) and weekly-frequency (long-term) realized volatility models.
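The two steps described above, computing daily realized volatility from intraday log prices and assembling the regressors of a HAR-type model, can be sketched as follows. This is a minimal sketch under stated assumptions: it follows the widely used HAR-RV layout with daily, weekly (5-day), and monthly (22-day) averages of past RV, and it takes the h-day target to be the average RV over the next h days. Whether the specification used here includes a monthly term or averages the target in this way is an assumption, and the function names are hypothetical.

```python
def realized_volatility(log_prices):
    """Daily RV: sum of squared intraday log returns (differences of log prices)."""
    returns = [later - earlier for earlier, later in zip(log_prices, log_prices[1:])]
    return sum(r * r for r in returns)


def har_rows(rv, h=1, week=5, month=22):
    """Build (target, [1, RV_daily, RV_weekly, RV_monthly]) rows for OLS.

    target is the average RV over days t+1 .. t+h (assumed convention);
    the regressors use information up to and including day t only.
    """
    rows = []
    for t in range(month - 1, len(rv) - h):
        daily = rv[t]
        weekly = sum(rv[t - week + 1 : t + 1]) / week
        monthly = sum(rv[t - month + 1 : t + 1]) / month
        target = sum(rv[t + 1 : t + 1 + h]) / h
        rows.append((target, [1.0, daily, weekly, monthly]))
    return rows


# Illustrative toy series only (not data from the study).
toy_rv = [float(day) for day in range(1, 31)]
daily_rows = har_rows(toy_rv, h=1)   # short-term (one-day) targets
weekly_rows = har_rows(toy_rv, h=5)  # long-term (one-week) targets
```

The resulting rows can be passed to any ordinary least squares routine; keeping the regressors strictly in the past of the target is what makes the fitted model usable for forecasting.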
Table 1: Proxy indicators of investor sentiment.
ATurnover: $\mathrm{ATurnover}_t = \mathrm{Turnover}_t \times (R_t/|R_t|)$. The frequency of reselling stocks in a certain period.
MTM: The continuation of the rate of return along its original trend.
Note: $R_t$ is the market rate of return at time t; $NZF_t$ and $NDF_t$ are, respectively, the numbers of limit-up and limit-down stocks at time t; $R_{i,t}$ is the rate of return of the ith stock at time t; $I[\cdot]$ is the indicator function, equal to 1 when $[\cdot]$ is true and 0 otherwise; $Z_t$ and $D_t$ are, respectively, the numbers of rising and falling stocks in the market at time t; and $P^2_{i,t}$ and $P^1_{i,t}$ are, respectively, the closing price and opening price of the ith stock at time t.
Implied Volatility Model (ARMA-IV Model).
In the option pricing formulas from which implied volatility is backed out, $C_t$ is the call option price at time t, $P_t$ is the put option price at time t, $X_t$ is the strike price of the option at time t, $S_t$ is the price of the underlying asset at time t, $r_t$ is the risk-free interest rate at time t, and $\sigma^1_t$ and $\sigma^2_t$ are, respectively, the implied volatilities of call and put options.
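Implied volatility is the volatility value at which a pricing formula reproduces the observed option price. The sketch below shows one common way to recover it, by bisection. It assumes European options priced with the Black-Scholes formula, which the text does not confirm, and it adds a time-to-maturity argument `tau` (in years) that is not among the symbols listed above. All function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(spot, strike, rate, tau, sigma, kind="call"):
    """Black-Scholes price of a European call or put (assumed pricing model)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    discounted_strike = strike * math.exp(-rate * tau)
    if kind == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)


def implied_vol(price, spot, strike, rate, tau, kind="call", lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_price(sigma) = price for sigma by bisection on [lo, hi]."""
    low_price = bs_price(spot, strike, rate, tau, lo, kind)
    high_price = bs_price(spot, strike, rate, tau, hi, kind)
    if not low_price <= price <= high_price:
        raise ValueError("price is outside the range attainable on [lo, hi]")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The price is increasing in sigma, so move the bracket toward the root.
        if bs_price(spot, strike, rate, tau, mid, kind) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative round trip on toy inputs (not data from the study).
toy_call = bs_price(2.5, 2.5, 0.03, 0.25, 0.25, "call")
toy_iv = implied_vol(toy_call, 2.5, 2.5, 0.03, 0.25, "call")
```

Bisection is slower than Newton's method but cannot diverge, because the option price is monotonically increasing in volatility for both calls and puts.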
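In summary, we follow the method of Yang and Ma [11], select the ARMA model, and introduce the implied volatility of call options, put options, and their combination. The standard ARMA(p,q) model equation is
$$y_t = c + \sum_{i=1}^{p}\varphi_i\,y_{t-i} + \varepsilon_t + \sum_{j=1}^{q}\theta_j\,\varepsilon_{t-j},$$
where $y_t$ is the series being modeled, c is a constant, $\varphi_i$ and $\theta_j$ are the autoregressive and moving-average coefficients, and $\varepsilon_t$ is a white-noise error term. We find that the AIC value of the model reaches its minimum when p = 3 and q = 1. Similarly, the AIC value reaches its minimum when the implied volatility of tier 1 call options ($IV^{call}$) and tier 2 put options ($IV^{put}$) is introduced. Therefore, we obtain the ARMA-IV model as the ARMA(3,1) model augmented with these implied volatility terms.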

Historical Volatility Model (GJR-GARCH Model).
Considering that the traditional GARCH model does not distinguish between the effects of rises and falls in asset returns on volatility, we choose the GJR-GARCH model as the historical volatility model. The GJR-GARCH model is better suited to time series with skewed distributions, describes the leverage effect of asset volatility more accurately, and can further improve the accuracy of forecasting. Based on the AIC criterion, the orders p and q of the conditional variance equation of the GJR-GARCH(p,q) model are both set to 1. We denote by $HVGJR_t$ the historical volatility (HV) time series obtained from the resulting GJR-GARCH(1,1) model. The HAR-RV model predicts the volatility series from high-frequency market data, while the GJR-GARCH model is based on daily market data, so the prediction accuracy of the two models may differ.
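The asymmetry that motivates this choice can be seen in a short sketch of the GJR-GARCH(1,1) variance recursion. It assumes the standard textbook form $\sigma_t^2 = \omega + (\alpha + \gamma I[\varepsilon_{t-1} < 0])\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$, whose notation may differ from the specification estimated here. The parameter values are illustrative, not estimates from the study, and the function name is hypothetical.

```python
def gjr_garch_variances(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances from the GJR-GARCH(1,1) recursion.

    Returns len(residuals) + 1 values: the initial variance followed by the
    variance implied after each residual, so the last entry is the
    one-step-ahead forecast.
    """
    variances = [initial_variance]
    for eps in residuals:
        # The extra term gamma applies only to negative shocks (leverage effect).
        leverage = gamma if eps < 0 else 0.0
        next_variance = omega + (alpha + leverage) * eps * eps + beta * variances[-1]
        variances.append(next_variance)
    return variances


# Illustrative parameters only (not estimates from the study).
params = dict(omega=0.1, alpha=0.1, gamma=0.2, beta=0.5, initial_variance=1.0)
after_gain = gjr_garch_variances([1.0], **params)   # positive shock
after_loss = gjr_garch_variances([-1.0], **params)  # negative shock of equal size
```

With a positive `gamma`, a loss raises next-period variance more than a gain of the same size, which a symmetric GARCH(1,1) model cannot reproduce.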

Volatility Model Based on Market Liquidity and Investor Sentiment.
To test the influence of investor sentiment on the prediction effect of the volatility model, we introduce investor sentiment variables into the HAR-RV model, the ARMA-IV model, and the GJR-GARCH model, and we obtain the volatility models based on investor sentiment (HAR-RV-S model, ARMA-IV-S model, and GJR-GARCH-S model), in which $D^a_t$ is the dummy variable of investor sentiment. In the same way, we introduce market liquidity variables into the HAR-RV model, the ARMA-IV model, and the GJR-GARCH model, and we obtain the volatility models based on market liquidity (HAR-RV-L model, ARMA-IV-L model, and GJR-GARCH-L model), in which $D^b_t$ is the dummy variable of market liquidity. Finally, we introduce market liquidity and investor sentiment together into the HAR-RV model, the ARMA-IV model, and the GJR-GARCH model, and we obtain the volatility models based on market liquidity and investor sentiment (HAR-RV-S-L model, ARMA-IV-S-L model, and GJR-GARCH-S-L model).
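One way to read the dummy-variable construction is sketched below: a dummy marks the high-sentiment (or high-liquidity) state, and its product with a volatility regressor becomes an additional interaction term in the model. The cut-off used here, the sample median, is an assumption, as is the choice to interact the dummy with lagged RV; the text does not state how the dummies are defined. The function names are hypothetical.

```python
import statistics


def high_state_dummy(series, threshold=None):
    """Return 1 where the indicator exceeds the threshold, else 0.

    Assumption: when no threshold is given, the sample median is used.
    """
    cut = statistics.median(series) if threshold is None else threshold
    return [1 if value > cut else 0 for value in series]


def interaction(dummy, regressor):
    """Element-wise product D_t * x_t, used as an extra regressor."""
    if len(dummy) != len(regressor):
        raise ValueError("dummy and regressor must have the same length")
    return [d * x for d, x in zip(dummy, regressor)]


# Illustrative toy values only (not data from the study).
toy_sent = [-1.0, 0.5, 2.0, 0.0, 1.5]
toy_rv = [1.0, 2.0, 3.0, 4.0, 5.0]
sent_dummy = high_state_dummy(toy_sent)
sent_times_rv = interaction(sent_dummy, toy_rv)
```

A significant coefficient on the interaction term then indicates that the link between past and future volatility differs between the high and low states.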

In-Sample Prediction Results.
In this part, we first make an in-sample forecast of the actual volatility of the next day. Because heteroscedasticity and autocorrelation in the sample time series significantly reduce the validity of the model results, we use the Newey-West t-test to adjust the model results. The in-sample parameter estimates and prediction comparisons for the above models are shown in Table 3. Table 3 shows the in-sample prediction results of the HAR-RV model. All parameter estimates of the HAR-RV model are significantly positive at the 1% level. In the HAR-RV-S model, the estimated interaction coefficient after introducing investor sentiment is significantly positive at the 5% level, and the prediction accuracy improves by 1.2% based on the adjusted $R^2$. In the HAR-RV-L model, the prediction accuracy after introducing market liquidity increases by 0.4%, and the estimated interaction coefficient for short-term volatility is significantly negative at the 5% level, while the estimated interaction coefficient for long-term volatility is not significant. In the HAR-RV-S-L model, the prediction accuracy improves by 1.7% after introducing investor sentiment and market liquidity at the same time. The estimated interaction coefficients of investor sentiment and market liquidity for short-term volatility are significantly negative at the 5% level, while for long-term volatility the investor sentiment interaction coefficient is significantly positive at the 5% level but the estimated liquidity coefficient is not significant. The prediction results in Table 3 show that the realized volatility model can effectively predict the actual volatility and that this prediction ability is affected by high investor sentiment and low market liquidity.
After introducing investor sentiment and market liquidity at the same time, the effect of high investor sentiment in predicting long-term volatility is significantly reduced, indicating that the predictive effect of high investor sentiment depends in part on the impact of market liquidity. Table 4 shows the in-sample prediction results of the ARMA-IV model. It is found that most of the parameter estimation results of the ARMA-IV model are significant at the level of 5%, and the prediction accuracy of implied volatility based on the combination of call and put options is the highest, while the estimation results of investor sentiment and market liquidity coefficient introduced separately are not significant.
This shows that the implied volatility model can predict the actual volatility only partially, and that this predictive ability is not significantly affected by investor sentiment and market liquidity.
Note: *, **, and *** indicate that the regression coefficients are significant at the levels of 10%, 5%, and 1%, respectively.
Table 5 shows the in-sample prediction results of the GJR-GARCH model. The parameter estimates of the GJR-GARCH model are significant at the 1% level, but the prediction accuracy is low and the prediction effect of the GJR-GARCH model is poor. The estimated coefficients of investor sentiment and market liquidity, introduced separately, are not significant, and the forecast accuracy is essentially unchanged. This shows that the historical volatility model cannot effectively predict the actual volatility and that investor sentiment and market liquidity have no significant impact on the GJR-GARCH model.

Journal of Healthcare Engineering
In summary, realized volatility gives the best in-sample forecasts, and implied volatility gives better in-sample forecasts than historical volatility. Market liquidity significantly improves the forecasting ability of realized volatility and implied volatility, indicating that market liquidity contains a certain amount of incremental information for in-sample forecasts. Investor sentiment also has a positive impact on the forecasts of realized volatility and historical volatility, but its incremental information partially overlaps with that of market liquidity. As a result, the prediction accuracy when both are introduced is sometimes lower than when a single variable is introduced. Table 6 shows the out-of-sample prediction results of all models.

Out-of-Sample Prediction Results.
(1) The out-of-sample prediction effect for short-term volatility is generally better than for long-term volatility, indicating that the models capture the short-term volatility characteristics of the market better than the long-term ones.
(2) The out-of-sample prediction effect of the combined implied volatility is stronger than that of the implied volatility of a single option type, indicating that call and put options each carry partly independent internal market information for predicting volatility.
(3) Compared with implied volatility and realized volatility, historical volatility gives a poorer out-of-sample forecast of the actual volatility. This may be because the SSE 50ETF market is in an early stage of development and historical transaction data still contain a lot of noise.
(4) Introducing market liquidity or investor sentiment separately into the prediction models significantly improves their predictive and explanatory power, and market liquidity alone gives a better out-of-sample prediction than investor sentiment alone.
(5) For long-term volatility, introducing market liquidity and investor sentiment at the same time performs worse out of sample than introducing market liquidity alone, but for short-term volatility, introducing the two indicators at the same time gives the best out-of-sample forecast. This shows that short-term market liquidity and investor sentiment contain more independent internal market information than their long-term counterparts.

Conclusion
Based on the results of empirical analysis, we draw the following conclusions:
(1) The realized volatility model has the best predictive effect on future volatility, while the prediction effect of the historical volatility and implied volatility models is not ideal, which indicates that realized volatility calculated from high-frequency data contains more intrinsic market information. Although the information implied by daily option prices and historical trading data is also informative, the forecasting effect is not ideal, possibly because daily data contain a lot of noise and the SSE 50ETF options market is in an early stage of development with insufficient liquidity.
(2) There are significant differences in forecasting effectiveness between the long-term (one-week) and the short-term (one-day) horizons. Compared with long-term (one-week) volatility, short-term (one-day) volatility is predicted better. This may be because the proportion of individual investors in the Chinese stock market is too large, leading to frequent short-term market transactions and making market asset prices contain more information.
(3) The introduction of market liquidity and investor sentiment health can improve the predictive ability of the volatility model, but the improvement from liquidity is larger. At the same time, the short-term volatility prediction effect of introducing liquidity and investor sentiment is poor, while the prediction effect for long-term volatility is more ideal, which shows that both liquidity and investor sentiment have the ability to explain market volatility. Compared with the short-term incremental information, long-term liquidity and investor sentiment may contain more independent internal market information.
The empirical evidence of this article is instructive for relevant government departments and investors in understanding the impact of investor sentiment health and market liquidity on market volatility. Including different kinds of volatility and various indicators in the forecasting model provides a new idea for volatility forecasting research and a new direction for the formulation of public policies. On the one hand, relevant government departments should pay more attention to the impact of investor sentiment health and market liquidity on market volatility, establish higher-frequency and more effective market monitoring mechanisms, focus on market volatility when extreme investor sentiment and liquidity occur, and intervene with appropriate policies when necessary to prevent frequent surges and falls in market prices. On the other hand, given that China's options market has long been in an early stage of rapid development, relevant government departments should strongly promote information transparency, actively improve the information disclosure system, and establish an effective options trading market to support the development of the physical industry and the improvement of organizational performance.

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.